Exploratory Data Analysis for Complex Models
Authors
Abstract
“Exploratory” and “confirmatory” data analysis can both be viewed as methods for comparing observed data to what would be obtained under an implicit or explicit statistical model. For example, many of Tukey’s methods can be interpreted as checks against hypothetical linear models and Poisson distributions. In more complex situations, Bayesian methods can be useful for constructing reference distributions for various plots that are useful in exploratory data analysis. This article proposes an approach to unify exploratory data analysis with more formal statistical methods based on probability models. These ideas are developed in the context of examples from fields including psychology, medicine, and social science.
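The idea of comparing observed data to a reference distribution can be illustrated with a small simulation. The following is a minimal sketch, not the article's own procedure: it assumes count data, a Poisson model with a conjugate Gamma prior on the rate, and the variance-to-mean ratio as the test statistic. The function names, the prior parameters, and the example counts are all hypothetical choices made for illustration.

```python
import numpy as np


def dispersion(y, axis=None):
    """Variance-to-mean ratio; close to 1 for Poisson-distributed counts."""
    y = np.asarray(y, dtype=float)
    return y.var(axis=axis, ddof=1) / y.mean(axis=axis)


def reference_distribution(y, n_rep=1000, prior_shape=1.0, prior_rate=1.0, seed=0):
    """Simulate the dispersion statistic for replicated datasets.

    Assumed model (illustrative only): y_i ~ Poisson(lam),
    lam ~ Gamma(prior_shape, rate=prior_rate).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    n = y.size
    # Conjugate update: posterior is Gamma(shape + sum(y), rate + n).
    post_shape = prior_shape + y.sum()
    post_rate = prior_rate + n
    # NumPy parameterizes the gamma by scale = 1 / rate.
    lam = rng.gamma(post_shape, 1.0 / post_rate, size=n_rep)
    # One replicated dataset per posterior draw of the rate.
    y_rep = rng.poisson(lam[:, None], size=(n_rep, n))
    return dispersion(y_rep, axis=1)


# Hypothetical counts with many zeros and a few large values.
y_obs = np.array([0, 1, 0, 12, 3, 0, 15, 2, 0, 9, 1, 14])

observed = dispersion(y_obs)
ref = reference_distribution(y_obs)
# Share of replicated datasets at least as dispersed as the observed one.
p_value = float(np.mean(ref >= observed))
print(f"observed dispersion: {observed:.2f}, tail proportion: {p_value:.3f}")
```

A histogram of `ref` with a vertical line at `observed` would give the graphical version of the same check: an observed value far in the tail suggests that the assumed Poisson model misses a feature of the data, here overdispersion.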
Similar resources
A Planning Representation for Automated Exploratory Data Analysis
Igor is a knowledge-based system for exploratory statistical analysis of complex systems and environments. Igor has two related goals: to help automate the search for interesting patterns in data sets, and to help develop models that capture significant relationships in the data. We outline a language for Igor, based on techniques of opportunistic planning, which balances control and opportunis...
Application of C-A fractal model and exploratory data analysis (EDA) to delineate geochemical anomalies in the Takab 1:25,000 geochemical sheet, NW Iran
Abstract Most conventional statistical methods aiming at defining geochemical concentration thresholds for separating anomalies from background have limited effectiveness in areas with complex geological settings and variable lithology. In this paper, median+2MAD as a method of exploratory data analysis (EDA) and concentration-area (C-A) fractal model as two effective approaches in separation g...
Complex-Valued Data Envelopment Analysis
Data Envelopment Analysis (DEA) is a nonparametric approach for measuring the relative efficiency of decision-making units with multiple inputs and outputs. All standard DEA models assume semi-positive real-valued measures, while in some real cases inputs and outputs may take complex values. The question is how to measure efficiency in such cases. As far as we are aware, ...
Automated Analysis of Complex Data
Igor is a knowledge-based system for exploratory statistical analysis of complex data. Igor has two related goals: to help automate the search for interesting patterns in data sets, and to help develop models that capture significant relationships in the data. Igor relies on a planning representation to manage the complexities of data analysis. We outline the relationship between planning and E...
A Methodology for Establishing Information Quality Baselines for Complex, Distributed Systems
We introduce a methodology for improving information quality for complex, distributed event-based systems and apply this methodology to an electronic payments system. The methodology consists of five integrated activities: 1) Exploratory data analysis to identify key features of the data. 2) Developing analytical models that detect statistically significant changes from baselines for data field...